Search CORE

11 research outputs found

Reliably prototyping large SoCs using FPGA clusters

Author: Fox PJ
Markettos AT
Moore SW
Publication venue: 2014 9th International Symposium on Reconfigurable and Communication-Centric Systems-on-Chip, ReCoSoC 2014
Publication date: 01/01/2014

Apollo (Cambridge)

Interconnect for commodity FPGA clusters: Standardized or customized?

Author: Fox PJ
Markettos AT
Moore AW
Moore SW
Publication venue: Conference Digest - 24th International Conference on Field Programmable Logic and Applications, FPL 2014
Publication date: 01/09/2014

Crossref

Apollo (Cambridge)

General hardware multicasting for fine-grained message-passing architectures

Author: Beaumont JR
Brown A
Bytheway T
Fleming S
Markettos AT
Moore SW
Naylor M
Thomas D
Vousden M
Publication venue: Proceedings - 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2021
Publication date: 01/01/2021

Manycore architectures are increasingly favouring message-passing or partitioned global address spaces (PGAS) over cache coherency for reasons of power efficiency and scalability. However, in the absence of cache coherency, there can be a lack of hardware support for one-to-many communication patterns, which are prevalent in some application domains. To address this, we present new hardware primitives for multicast communication in rack-scale manycore systems. These primitives guarantee delivery to both colocated and distributed destinations, and can capture large unstructured communication patterns precisely. As a result, reliable multicast transfers among any number of software tasks, connected in any topology, can be fully offloaded to hardware. We implement the new primitives in a research platform consisting of 50K RISC-V threads distributed over 48 FPGAs, and demonstrate significant performance benefits on a range of applications expressed using a high-level vertex-centric programming model.

Southampton (e-Prints Soton)

Apollo (Cambridge)

Conquering the complexity mountain: Full-stack computer architecture teaching with FPGAs

Author: Gavrila VA
Jones BD
Markettos AT
Moore SW
Spliet R
Publication venue: 2016 11th European Workshop on Microelectronics Education, EWME 2016
Publication date: 01/01/2016

Modern computer systems are exceedingly complex, and increasingly so. This makes it challenging for students with no background in computer systems to climb the mountain of 40 years of design, particularly within a constrained teaching timetable. Through the medium of FPGAs, we have designed an 8-week course to take students from basic digital electronics through to processor design, modern software tools, applications, system-on-chip integration and electronics manufacturing. We recount our experiences with rapidly bringing students up to speed with the modern world of computing systems, and some of the lessons we, as course designers, were taught by the process.

Apollo (Cambridge)

Efficient tagged memory

Author: Baldwin J
Bradbury A
Chisnall D
Davis B
Gudka K
Joannou A
Kovacsics R
Markettos AT
Mazzinghi A
Moore SW
Napierala E
Neumann PG
Richardson A
Roe M
Son S
Watson RNM
Woodruff J
Xia H
Publication venue: Proceedings - 35th IEEE International Conference on Computer Design, ICCD 2017
Publication date: 01/01/2017

We characterize the cache behavior of an in-memory tag table and demonstrate that an optimized implementation can typically achieve a near-zero memory traffic overhead. Both industry and academia have repeatedly demonstrated tagged memory as a key mechanism to enable enforcement of powerful security invariants, including capabilities, pointer integrity, watchpoints, and information-flow tracking. A single-bit tag shadowspace is the most commonly proposed requirement, as one bit is the minimum metadata needed to distinguish between an untyped data word and any number of new hardware-enforced types. We survey various tag shadowspace approaches and identify their common requirements and positive features of their implementations. To avoid non-standard memory widths, we identify the most practical implementation for tag storage to be an in-memory table managed next to the DRAM controller. We characterize the caching performance of such a tag table and demonstrate a DRAM traffic overhead below 5% for the vast majority of applications. We identify spatial locality on a page scale as the primary factor that enables surprisingly high table cache-ability. We then demonstrate tag-table compression for a set of common applications. A hierarchical structure with elegantly simple optimizations reduces DRAM traffic overhead to below 1% for most applications. These insights and optimizations pave the way for commercial applications making use of single-bit tags stored in commodity memory.
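
The idea of a hierarchical table of single-bit tags, as summarized in the abstract above, can be illustrated with a small model. This is a minimal Python sketch and not the authors' implementation: the class name `HierarchicalTagTable`, the group size of 512 tags, and the `leaf_reads` counter are hypothetical choices made for illustration. It assumes a two-level layout in which one root bit records whether any tag in a group is set, so lookups in all-clear regions never touch the leaf table.

```python
class HierarchicalTagTable:
    """Toy two-level table of one-bit tags (illustrative assumption only)."""

    def __init__(self, num_words, group_size=512):
        self.num_words = num_words
        self.group_size = group_size
        num_groups = (num_words + group_size - 1) // group_size
        # Root level: one bit per group, True if any tag in the group is set.
        self.root = [False] * num_groups
        # Leaf level: per-group tag bits, allocated only when needed.
        self.leaves = {}
        # Counts simulated leaf-table accesses (a stand-in for DRAM traffic).
        self.leaf_reads = 0

    def _locate(self, word):
        if not 0 <= word < self.num_words:
            raise IndexError("word index out of range")
        return divmod(word, self.group_size)

    def set_tag(self, word, value):
        group, offset = self._locate(word)
        if value:
            leaf = self.leaves.setdefault(group, [False] * self.group_size)
            leaf[offset] = True
            self.root[group] = True
        elif self.root[group]:
            leaf = self.leaves[group]
            leaf[offset] = False
            if not any(leaf):
                # Whole group is clear again: drop the leaf, clear the root bit.
                self.root[group] = False
                del self.leaves[group]

    def get_tag(self, word):
        group, offset = self._locate(word)
        if not self.root[group]:
            # Root bit says the group holds no tags: no leaf access needed.
            return False
        self.leaf_reads += 1
        return self.leaves[group][offset]
```

In this model, a workload whose tagged words cluster in a few groups pays for leaf lookups only in those groups, which is one plausible reading of why page-scale spatial locality helps.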

Apollo (Cambridge)

Fast Protection-Domain Crossing in the CHERI Capability-System Architecture

Author: Anderson J
Chisnall D
Dave NH
Davis B
Gudka K
Joannou A
Laurie B
Markettos AT
Maste E
Moore SW
Murdoch SJ
Neumann PG
Norton RM
Roe M
Rothwell C
Son SD
Vadera M
Watson RNM
Woodruff J
Publication venue: IEEE Micro
Publication date: 01/01/2016

Capability Hardware Enhanced RISC Instructions (CHERI) supplement the conventional memory management unit (MMU) with instruction-set architecture (ISA) extensions that implement a capability system model in the address space. CHERI can also underpin a hardware-software object-capability model for scalable application compartmentalization that can mitigate broader classes of attack. This article describes ISA additions to CHERI that support fast protection-domain switching, not only in terms of low cycle count, but also efficient memory sharing with mutual distrust. The authors propose ISA support for sealed capabilities, hardware-assisted checking during protection-domain switching, a lightweight capability flow-control model, and fast register clearing, while retaining the flexibility of a software-defined protection-domain transition model. They validate this approach through a full-system experimental design, including ISA extensions, a field-programmable gate array prototype (implemented in Bluespec SystemVerilog), and a software stack including an OS (based on FreeBSD), compiler (based on LLVM), software compartmentalization model, and open-source applications. This work is part of the CTSRD and MRC2 projects sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contracts FA8750-10-C-0237 and FA8750-11-C-0249. We also acknowledge the Engineering and Physical Sciences Research Council (EPSRC) REMS Programme Grant [EP/K008528/1], the EPSRC Impact Acceleration Account [EP/K503757/1], EPSRC/ARM iCASE studentship [13220009], Microsoft studentship [MRS2011-031], the Isaac Newton Trust, the UK Higher Education Innovation Fund (HEIF), Thales E-Security, and Google, Inc. This is the author accepted manuscript. The final version of the article can be found at: http://ieeexplore.ieee.org/document/7723791

UCL Discovery

Apollo (Cambridge)

CheriABI: Enforcing Valid Pointer Provenance and Minimizing Pointer Privilege in the POSIX C Run-time Environment

Author: Baldwin J
Chisnall D
Clarke J
Davis B
Filardo NW
Gudka K
Joannou A
Laurie B
Markettos AT
Maste JE
Mazzinghi A
Moore SW
Napierala ET
Neumann PG
Norton RM
Richardson A
Roe M
Sewell P
Son S
Watson RNM
Woodruff J
Publication venue: International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
Publication date: 01/01/2019

The CHERI architecture allows pointers to be implemented as capabilities (rather than integer virtual addresses) in a manner that is compatible with, and strengthens, the semantics of the C language. In addition to the spatial protections offered by conventional fat pointers, CHERI capabilities offer strong integrity, enforced provenance validity, and access monotonicity. The stronger guarantees of these architectural capabilities must be reconciled with the real-world behavior of operating systems, run-time environments, and applications. When the process model, user-kernel interactions, dynamic linking, and memory management are all considered, we observe that simple derivation of architectural capabilities is insufficient to describe appropriate access to memory. We bridge this conceptual gap with a notional abstract capability that describes the accesses that should be allowed at a given point in execution, whether in the kernel or userspace. To investigate this notion at scale, we describe the first adaptation of a full C-language operating system (FreeBSD) with an enterprise database (PostgreSQL) for complete spatial and referential memory safety. We show that awareness of abstract capabilities, coupled with CHERI architectural capabilities, can provide more complete protection, strong compatibility, and acceptable performance overhead compared with the pre-CHERI baseline and software-only approaches. Our observations also have potentially significant implications for other mitigation techniques. This work was supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contracts FA8750-10-C-0237 ("CTSRD") and HR0011-18-C-0016 ("ECATS"). The views, opinions, and/or findings contained in this report are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.
We also acknowledge the EPSRC REMS Programme Grant (EP/K008528/1), the ERC ELVER Advanced Grant (789108), Arm Limited, HP Enterprise, and Google, Inc. Approved for Public Release, Distribution Unlimited.
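
The spatial protection and access monotonicity mentioned in the abstract above can be made concrete with a toy software model. This is a minimal Python sketch under stated assumptions, not the CHERI ISA or the CheriABI implementation: the `Capability` class, its `restrict` and `check_access` methods, and the `READ` and `WRITE` permission bits are hypothetical names invented for illustration. It shows only the property that a derived capability can never cover more memory or carry more permissions than the capability it was derived from.

```python
from dataclasses import dataclass

# Hypothetical permission bits for this sketch.
READ, WRITE = 1, 2


@dataclass(frozen=True)
class Capability:
    """Toy bounded pointer with permissions and a validity tag."""
    base: int
    length: int
    perms: int
    tag: bool = True

    def restrict(self, base, length, perms):
        """Derive a new capability; bounds and permissions may only shrink."""
        if not self.tag:
            raise ValueError("cannot derive from an invalid capability")
        if length < 0 or base < self.base or base + length > self.base + self.length:
            raise ValueError("derived bounds would exceed the parent's bounds")
        if perms & ~self.perms:
            raise ValueError("derived permissions would exceed the parent's")
        return Capability(base, length, perms)

    def check_access(self, addr, size, perm):
        """Return True if [addr, addr + size) is in bounds and perm is held."""
        in_bounds = self.base <= addr and addr + size <= self.base + self.length
        return self.tag and in_bounds and (self.perms & perm) == perm
```

For example, a read-only capability derived for a 16-byte sub-range of a larger read-write region permits reads inside that sub-range, refuses writes, and cannot be widened back to the parent's bounds.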

Apollo (Cambridge)

Inside risks through computer architecture, Darkly

Author: Markettos AT
Moore SW
Neumann PG
Sewell P
Watson RNM
Publication venue: Communications of the ACM
Publication date: 04/04/2023

Total-system hardware and microarchitectural issues are becoming increasingly critical.

Apollo (Cambridge)

General hardware multicasting for fine-grained message-passing architectures

Author: Beaumont JR
Brown A
Bytheway Thomas
Fleming S
Markettos AT
Moore Simon
Naylor Matthew
Thomas D
Vousden M
Publication venue: Proceedings - 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2021
Publication date: 01/01/2021

Apollo (Cambridge)

Termination detection for fine-grained message-passing architectures

Author: Beaumont J
Brown A
Bytheway T
Fleming S
Markettos AT
Mokhov A
Moore S
Naylor M
Thomas D
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date

Newcastle University E-Prints